Ambient air pollutants and breast cancer stage in Tehran, Iran

This study aimed to examine the impacts of single and multiple air pollutants (AP) on the severity of breast cancer (BC). Data on 1148 diagnosed BC cases (2008–2016) were obtained from the Cancer Research Center and private oncologist offices in Tehran, Iran. Ambient PM10, SO2, NO, NO2, NOX, benzene, toluene, ethylbenzene, m-xylene, p-xylene, o-xylene, and BTEX data were obtained from previously developed land use regression models. Associations between pollutants and stage of BC were assessed by multinomial logistic regression models. An increase of 10 μg/m3 in ethylbenzene, o-xylene, and m-xylene, and of 10 ppb in NO, corresponded to odds ratios (OR) of 10.41 (95% CI 1.32–82.41), 4.07 (1.46–11.33), 2.89 (1.08–7.73), and 1.08 (1.00–1.15), respectively, for stage I versus non-invasive BC. Benzene (OR = 1.16, 95% CI 1.01–1.33) and o-xylene (OR = 1.18, 1.02–1.38) were associated with increased odds of BC stages III & IV versus non-invasive stages. Among women living in areas of low socioeconomic status (SES), BC stage I and stages III & IV were associated with significantly higher levels of benzene, ethylbenzene, o-xylene, and m-xylene. The highest multiple-air-pollutants quartile was associated with higher odds of stage I BC (OR = 3.16) in patients under 50 years old. This study provides evidence that exposure to AP is associated with a higher BC stage at diagnosis, especially in women of premenopausal age.

Ambient air pollution (AAP) is a complex mixture of various gaseous pollutants and solid particles 1,2. According to the World Health Organization (WHO) Ambient Air Quality Database 2022, more than 80% of people living in urban areas with air monitoring devices are exposed to air quality levels exceeding WHO thresholds 3. AAP is estimated to have caused 4.2 million premature deaths globally in 2019 4. Many studies indicate that AAP can increase the risk of myocardial infarction 5, stroke 6, headache 7–10, disorders in fetal development 11, asthma 12, chronic obstructive pulmonary disease (COPD) 2, attention deficit hyperactivity disorder (ADHD) in children 13, and neurological diseases 14. Furthermore, in 2013, AAP was classified by the International Agency for Research on Cancer (IARC) as a Group 1 human carcinogen, mostly on the basis of evidence relating it to lung cancer 15.
According to the IARC report in 2020, breast cancer (BC) has now become the most prevalent cancer worldwide, surpassing lung cancer 16. It is also one of the leading causes of death globally 17,18, and its burden has been increasing in many parts of the world over the past decades 19. In 2020, 2.3 million women worldwide were diagnosed with BC, which corresponds to 1 in 8 diagnosed cancers, and 685,000 deaths occurred, which accounts for 1 in 6 cancer deaths in women 18,20,21. By the end of 2020, 7.8 million women had been diagnosed with BC in the previous 5 years, making it the world's most common cancer 21. It is predicted that by 2040 the burden of BC will increase to over 3 million new cases and 1 million deaths annually 19.
BC is a complex multifactorial disease with several known risk factors, including female gender, older age, family history of BC, overweight and obesity, physical inactivity, history of radiation exposure, reproductive history (such as early menarche and late first pregnancy), tobacco and alcohol use, history of other benign breast diseases, short breastfeeding periods, and postmenopausal hormone therapy or oral contraceptive use 21,22. However, almost half of breast cancers diagnosed in women have no detectable risk factor 21, suggesting a need to identify still unknown risk factors.
There is growing evidence that AAP can be a risk factor for BC 23,24. Ecological studies suggest that BC risk is higher in urban areas with higher air pollution than in rural areas 25. AAP contains many carcinogens that may act as endocrine disruptors and cause oxidative DNA damage, which may affect BC risk 25–27. However, studies have reported inconsistent results about the impact of AAP on BC 23,24,28–30. The effect of AAP on BC remains uncertain because causality is difficult to prove given the long latent period and the low-dose exposure in the environment 31. Meanwhile, the dissimilar findings of published studies can be partly explained by the diversity in AAP and exposure measurement methods and by variations in study design 32. Therefore, as Wei et al. (2021) recommended, there is a need to conduct studies, especially in developing countries, with improved exposure measurement and covariate adjustment 33.
In this regard, many researchers have adopted a multiple-pollutant (instead of single-pollutant) approach to evaluate the effects of air pollution, because humans are usually exposed to a complex mixture of air pollutants 34. In models that assess the impact of a single pollutant, it is difficult to determine whether an observed association reflects the impact of the specific pollutant being investigated or the effect of other pollutants coinciding with it 35. Although a few studies have shown an effect of air pollution on the incidence of BC 36, there is limited information about its effect on the severity of the disease. Therefore, the objective of this study was to investigate the effect of multiple air pollutants on BC stages diagnosed in Tehran, Iran. As a secondary objective, we assessed the pollutants' impact on BC stages at different socioeconomic levels.

Study subjects
Data were obtained from the Cancer Research Center (CRC) of Shahid Beheshti University of Medical Sciences in Tehran. All eligible female patients diagnosed with BC (ICD-O-3 C50.0-C50.9) according to the pathology report between 2008 and 2016 in different districts of Tehran were included. The Institutional Review Board (IRB) of the CRC approved the study protocol.
Data about patients' characteristics were available for each patient, including demographic factors (age at diagnosis, education level, and marital status), lifestyle factors (smoking status), reproductive factors (ages at first menstruation and pregnancy, number of pregnancies and deliveries), estrogen/progesterone receptor status, and clinical pathologic information including stage at diagnosis (non-aggressive, stage I, II, or III&IV), number of metastatic lymph nodes, and family history of BC and diabetes. The frequency of missing data for all variables was low (≤ 5.5%).
Ethical approval was obtained (Code: IR.SBMU.CRC.REC.1400.008) from the Ethics Committee of the CRC of Shahid Beheshti University of Medical Sciences in Tehran, Iran, and all methods were performed in accordance with the relevant guidelines and regulations.

Residence, highway proximity, and neighbourhood socioeconomic status
Residential addresses at diagnosis were geocoded to latitude and longitude coordinates using address or street locators.
An index of socioeconomic status (SES) was created based on principal component analysis of sixteen district-based indicators of SES. The socioeconomic indicators of the 22 districts of Tehran were extracted from a local study 37. This SES index was assigned to participants' addresses at diagnosis and was categorized into quartiles.
Distance to the highway was used as a proxy for traffic-related exposures. The distance from each patient's address at diagnosis to the nearest main street or highway was calculated using the ESRI 2016 data layer. Distance to the highway was categorized into three groups: < 400, 400 to 800, and > 800 m.
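The three proximity groups described above can be illustrated with a minimal Python sketch. The function name and the sample distances are hypothetical, and the sketch assumes that distances of exactly 400 m and 800 m fall in the middle group, which the text does not specify.

```python
def highway_proximity_category(distance_m: float) -> str:
    """Assign a distance in metres to one of the three proximity groups."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m < 400:
        return "< 400 m"
    if distance_m <= 800:  # assumption: both boundaries belong to the middle group
        return "400 to 800 m"
    return "> 800 m"


# Hypothetical distances from residence to the nearest main street or highway.
distances = [120.0, 400.0, 650.5, 800.0, 1500.0]
categories = [highway_proximity_category(d) for d in distances]
```

A different boundary convention would only require changing the two comparisons.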

Long-term air pollution exposure assessment
Land use regression (LUR) models were used to estimate exposure levels of PM10 (particulate matter smaller than 10 microns), SO2 (sulphur dioxide), NO (nitric oxide), NO2 (nitrogen dioxide), and NOX (oxides of nitrogen) based on measurements conducted at 23 regulatory network monitoring sites in Tehran in 2010 38,39. The volatile organic compound (VOC) concentration levels were obtained from spatial models that were built using long-term measurements across about 180 sites in Tehran with very good performance 40. More details about the exposure assessment methods have been described elsewhere 41. Air pollution exposure was estimated for each patient based on the geocoded residential location at the time of BC diagnosis.
The ArcGIS Software (ArcGIS Locator version 10.0, ESRI, Redlands, CA, USA) and Tehran ArcGIS Shapefile Map Layers were used to geocode the residential addresses of the study subjects (X and Y coordinates of addresses).

Statistical analyses
Data were summarized as mean ± standard deviation (SD) for continuous variables and frequency (percentage) for categorical variables. Chi-square tests were used to assess differences in categorical variables across categories of BC. The Kolmogorov-Smirnov test was used to test the normality of the pollutant data and, because the data were not normally distributed, Spearman's correlation test was used to examine the correlations between different air pollutants and SES.
The statistical analysis consisted of three steps. First, we applied weighted quantile sum (WQS) regression analysis to estimate the joint effect of air pollution mixtures on the stage of BC. Second, we estimated the effect of each pollutant and of the multipollutant mixture on the stage of BC; associations with BC stage were modeled using multinomial logistic regression to estimate odds ratios (OR) and 95% confidence intervals (95% CI). Finally, the effect of each air pollutant on the stage of BC at different levels of SES was estimated.
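As an illustration of the rank-based correlation mentioned above, the following Python sketch computes Spearman's coefficient as the Pearson correlation of average ranks. It is not the STATA or R routine used in the study; the function names are hypothetical and only the standard library is used.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotone relationship, linear or not, yields a coefficient of 1 or -1, which is why the method suits non-normally distributed pollutant data.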
The weighted quantile sum (WQS) regression analysis was done using the R package "gWQS", and quantile-based g-computation estimates were obtained using the "qgcomp" package in R. WQS, developed specifically for the context of environmental mixtures analysis, is an increasingly common approach for multivariate regression in high-dimensional datasets. It operates in a supervised framework, creating a single score (the weighted quantile sum) that summarizes the overall exposure to the mixture, and includes this score in a regression model to evaluate the overall effect of the mixture on the outcome of interest 42. The score is calculated as a weighted sum of all exposures (so that exposures with weaker effects on the outcome have lower weight in the index), with the exposures categorized into quartiles (or more groups) so that extreme values have less impact on the weight estimation. A recent approach introduced by Keil et al. (2020), called quantile-based g-computation, estimates the overall mixture effect with the same procedure used by WQS, but estimates the parameters of a marginal structural model rather than the standard regression used in this study 43. This approach relies on the common assumptions of causal inference, such as exchangeability, causal consistency, positivity, no interference, and correct model specification. This model also improves the causal interpretation of the overall effect 44.
After that, bivariate multinomial logistic regression analysis was conducted to explore the association between the independent variables and BC stages. The cancer stage variable (outcome) had four categories: non-aggressive, stage I, stage II, and stage III & IV, with non-aggressive as the reference category. Each pollutant and all confounding variables (those with a P-value < 0.2 in bivariate analysis) were then modeled by multivariate-adjusted multinomial logistic regression analysis. We also stratified the models by age (≥ 50 years as menopausal and < 50 years as premenopausal).
Because the analysis examined the relations between BC stage and numerous correlated air pollutants, we used two different methods for parameterizing air pollutants in our study: single-pollutant and multipollutant. The lowest-quantile category of the multipollutant variable was used as the reference for comparison.
Finally, to understand differences in the associations between air pollutants and BC stage by SES, we assessed this association at low (quartiles 1 and 2) and high (quartiles 3 and 4) SES levels. Since data were from 22 districts, robust standard errors by cluster were incorporated into all analyses. Missing data were replaced by the variable's mode or median value. Data description and analyses were conducted using STATA 17 and R statistical software (version 4.0.2, License GPLv2).
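To make the construction of the score concrete, the sketch below shows how a WQS-type index could be assembled once weights are available: each exposure is converted to a quartile score (0 to 3) and the scores are combined as a weighted sum. This is only an illustration under stated assumptions and does not reproduce the gWQS package: the weights here are hypothetical fixed values that are non-negative and sum to 1, whereas in WQS regression they are estimated from the data, and the pollutant names and concentrations are invented.

```python
import math
import statistics


def quartile_scores(values):
    """Score each value 0-3 according to the sample quartile it falls in."""
    cuts = statistics.quantiles(values, n=4)  # three quartile cut points
    return [sum(v > c for c in cuts) for v in values]


def wqs_index(exposures, weights):
    """Weighted sum of quartile scores, one index value per subject.

    exposures: dict mapping pollutant name -> list of concentrations
    weights:   dict mapping pollutant name -> non-negative weight (sum to 1)
    """
    if set(exposures) != set(weights):
        raise ValueError("exposures and weights must name the same pollutants")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    scores = {name: quartile_scores(vals) for name, vals in exposures.items()}
    n_subjects = len(next(iter(scores.values())))
    return [
        sum(weights[name] * scores[name][i] for name in scores)
        for i in range(n_subjects)
    ]


# Hypothetical concentrations for eight subjects and two pollutants.
exposures = {
    "pollutant_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "pollutant_b": [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
}
weights = {"pollutant_a": 0.75, "pollutant_b": 0.25}  # hypothetical weights
index = wqs_index(exposures, weights)
```

Working with quartile scores rather than raw concentrations is what limits the influence of extreme values: a very large concentration can never score higher than 3.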
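Odds ratios in this kind of analysis are often reported per 10-unit increase in a pollutant. The following sketch shows how a per-unit log-odds coefficient from a logistic model could be rescaled to that increment. The function name and the numeric inputs are hypothetical, and the interval is a Wald-type interval using the normal critical value 1.96, which is an assumption about the method; a cluster-robust standard error would simply be passed in as `se`.

```python
import math


def odds_ratio_per_increment(beta, se, increment=10.0, z=1.96):
    """Return (OR, lower, upper) for an `increment`-unit rise in exposure.

    beta: log-odds coefficient per 1 unit of exposure
    se:   standard error of beta
    """
    point = math.exp(increment * beta)
    lower = math.exp(increment * (beta - z * se))
    upper = math.exp(increment * (beta + z * se))
    return point, lower, upper


# Hypothetical coefficient and standard error, not taken from the study.
or_10, ci_low, ci_high = odds_ratio_per_increment(beta=0.05, se=0.01)
```

Rescaling on the log-odds scale before exponentiating keeps the interval consistent with the point estimate.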
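The single-value imputation mentioned above can be sketched as follows. The sketch assumes that the median was used for continuous variables and the mode for categorical ones, which the text implies but does not state explicitly; the function name and the example values are hypothetical.

```python
import statistics


def impute(values, kind):
    """Replace None with the median (continuous) or mode (categorical)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    if kind == "continuous":
        fill = statistics.median(observed)
    elif kind == "categorical":
        fill = statistics.mode(observed)
    else:
        raise ValueError("kind must be 'continuous' or 'categorical'")
    return [fill if v is None else v for v in values]


# Hypothetical records with missing entries marked as None.
age_at_menarche = impute([12, None, 14, 13, None], "continuous")
smoking_status = impute(["no", "yes", None, "no"], "categorical")
```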

Ethical approval and consent to participate
Participants in all studies provided written informed consent. Ethical approval was obtained (IR.SBMU.CRC.REC.1400.008) from the Ethics Committee of the Cancer Research Center (CRC) of Shahid Beheshti University of Medical Sciences, Tehran, Iran. The patient data were anonymous and strictly confidential.

Study population characteristics
The study population consisted of 1164 BC cases aged 20 years and older residing in 22 urban districts of Tehran during 2008-2016. We had to exclude the data of 16 subjects who lived in remote suburbs of Tehran, where air pollutant surveillance was not conducted. Finally, 1148 cases entered the analyses.
The distribution of BC patients in different regions of Tehran is shown in Fig. [...] BC, diabetes, smoking, pregnancy status, and ER-PR status among patients diagnosed at different stages of BC (Table 1). The distribution of BC patients in different categories is shown in Table 1. The most common stage of BC was stage II with 35.6%, followed by stage III & IV with 31.4% of cases (Fig. 2).
Figure 4a shows the contribution of each pollutant to the construction of the composite variable (multipollution) used in the multipollutant model. In this figure, the red dotted line indicates the significance level. According to the concentrations and pollutant weights, the contributions of ethylbenzene, NO2, and benzene were relatively small, while the contributions of PM10, p-xylene, o-xylene, and NO were relatively prominent in the multipollution variable. Furthermore, Fig. 4b shows the positive and negative weights of the variables in the construction of the multipollution variable. Of the four most influential variables, p-xylene and NO had positive weights, and o-xylene and PM10 had negative weights.
Summary statistics for each pollutant, multipollution variable, and proximity to the highway of BC cases are shown in Table 2.

Severity of BC and ambient air pollutants analysis
Table 3 shows the odds ratios (OR) and 95% confidence intervals (95% CI) from the crude analyses of the associations between the independent variables and cancer stages by multinomial logistic regression. Education level, smoking status, diabetes, family history of BC, age at first menstruation, number of pregnancies and deliveries, and highway proximity were included in the multivariate model because at least one of their categories had a P-value < 0.2.
In patients under 50 years old, in multi-pollutant models, the high multiple-air-pollution quartile was associated with higher odds of stage I BC (OR = 3.16, 95% CI 1.17-8.53) when compared with the low multiple-air-pollution quartile (Table 4). Furthermore, our results showed that the adjusted odds of BC stage I and stage III&IV associated with air pollution exposure were higher among low-SES cases (stages III & IV vs non-aggressive: OR for o-xylene = 2.69 and OR for m-xylene = 1.83; stage I vs non-aggressive: OR for benzene = 3.67, OR for ethylbenzene = 7.15, and OR for o-xylene = 2.49) (Table 5).

Discussion
To the best of our knowledge, this is the first study to examine the association between air pollutants (single and mixtures) and BC severity. The findings suggest that air pollutant exposure, especially in patients diagnosed under 50 years old, was associated with a higher stage of BC at diagnosis. This might mean that air pollutants accelerate breast cancer development and progression. As in various other studies 45, the spatial correlations between the individual air pollutants were rather high (−0.51 to 0.96); however, our study used a new approach and also investigated the effect of a mixture of air pollutants. Interest in determining the simultaneous effect of multiple pollutant exposure on health outcomes, and in identifying dominant pollutants, has been growing in recent years. Such studies can probably explain health outcomes much better than single-pollutant studies 45–48.
A recent approach introduced by Keil et al. (2020), called quantile-based g-computation, estimates the overall mixture effect with the same procedure used by WQS, but estimates the parameters of a marginal structural model rather than the standard regression. This model also improves the causal interpretation of the overall effect of multiple pollutants 40. It combines pollutants into a weighted additive index, which is used to estimate an overall mixture effect through a bootstrap resampling procedure and avoids overfitting and collinearity 42. This model has been used in an increasing number of studies 42,49,50. Our results showed a significant association between air pollution and BC severity after adjusting for smoking status, diabetes, family history of BC, age at first menstruation, number of pregnancies, and highway proximity. Our study demonstrated some associations between ethylbenzene, o-xylene, m-xylene, and NO and stage of BC among women under 50 years old, and between o-xylene and benzene and stage of BC among women over 50 years old, in univariate models. Previous studies also suggest that air pollution might be related to breast cancer, particularly among women with a positive family history and those under 50 years old 32,51.
One meta-analysis of 36 effect estimates for PM2.5, PM10, and NO2 confirmed that decreasing long-term NO2 exposure or correlated air pollutant exposures could lower breast cancer risk, and also showed that associations of NO2 levels with breast cancer risk were higher in premenopausal than in postmenopausal women 52. In the current study, a significant association was seen between air pollutant exposures and severity of BC in premenopausal women. This suggests that the effect of air pollutants on BC can differ across periods of a woman's life and may be stronger during premenopause. Differences in cancer morphology or hormonal subtypes between pre- and post-menopausal women might explain this difference in the effect of air pollutants 53. Some studies found positive associations between air pollution and BC in postmenopausal women 54,55. Two recent reviews suggested a significantly increased risk of breast cancer associated with an increase in nitrogen dioxide (NO2) and nitrogen oxides (NOx) levels, both of which are proxies for traffic exposure 56,57. Also, a nested case-control study within the French E3N cohort showed increased odds of breast cancer associated with long-term exposure to NO2 air pollution 54.
Hwang et al., in a nationwide analysis in South Korea (2005-2016), showed that ambient air pollutant concentrations were positively and significantly associated with breast cancer odds; per 10 ppb increase in NO2, the OR for BC was 1.14 (95% CI = 1.12-1.16) 58. A cohort study conducted between 1980 and 1985 in urban centers in Canada suggested that exposure to NO2 increases the risk of premenopausal breast cancer; the rate ratio (RR) for an increase of 9.7 ppb (the interquartile range) was 1.13 (95% CI 0.94-1.37) among premenopausal patients 59.
Studies have shown that NO can directly inhibit the activity of caspases, providing an efficient means to block apoptosis, and can increase breast cancer development through estrogen and progesterone pathways, which are both involved in the carcinogenesis of breast cancer 60.
Results of a review study in 2018 showed that many individual air pollutants are genotoxic and some are estrogenic or anti-estrogenic. Polycyclic aromatic hydrocarbons (PAHs) are the most-studied component of air pollution in relation to breast cancer and include hundreds of compounds and their metabolites with different biologic activities, which are thought to specifically cause mammary gland tumors 36. PAHs also activate CYP3A4 via PXR receptors and can affect estrogen metabolism through these routes as well 61. A role of PAHs in tumor progression has been suggested in some studies 62, and the results from other studies done in different geographic locations showed that some VOCs are human carcinogens with strong evidence for genotoxicity, increased PAH-DNA adducts, and TP53 polymorphisms and mutations 36.
Although a nationwide analysis in South Korea showed that SO 2 concentrations were positively and significantly associated with the odds of breast cancer (per 1 ppb SO 2 , OR = 1.04, 95% CI = 1.02-1.05) 58, we found inverse effects of SO 2 on the severity of breast cancer among women over 50 years old.
In this study, we found negative and statistically significant correlations between air pollution and SES level. Interestingly, significant associations were seen between air pollution and BC severity in low-SES regions as well. The high correlation between SES and air pollutants suggests that part of the effect of air pollutants on BC may be explained by low SES. Population features, neighborhood deprivation, and air pollution levels are often interconnected, although the direction of associations may be different in different areas 63. Recent studies suggest this pattern could be linked to the composition of the air pollution mixture or the intrinsic vulnerability of the population 52. A recent Multiethnic Cohort (MEC) study among African American, European American, Japanese American, and Latina American women diagnosed with breast cancer reported a harmful impact of air pollutants on breast cancer survival, and that this association may be confounded by socioeconomic factors 64. One American study reported that the worst socio-demographic environmental quality increased the odds of distant metastatic breast cancer by 10% in non-metro-urbanized counties (OR 1.10; 95% CI 1.00-1.20, P = 0.035) 65.
In this study, breast cancer patient data were collected from the Shahid Beheshti University of Medical Sciences Cancer Research Center and oncologists' private offices, which included patients from all areas of Tehran but did not cover all the patients in Tehran.
To our knowledge, this is the largest study to date to examine the association between air pollutants and BC in Iran and the first study to explore the associations between BC severity and single and multiple air pollutants.
This study has the advantage of using WQS inference and the flexible g-computation method, which allows exploration of the nonlinear and non-additive effects of individual pollutants and of their mixture as a whole. Quantile g-computation is able to estimate the parameters of a marginal structural model 43. In grouped weighted quantile sum (GWQS) regression, multiple groups of pollutants can be included in the model, and the components of the multi-pollutant mixture are allowed to have different magnitudes and directions 42.
In this study, confounders were controlled for in the analysis; however, additional information on some potentially important individual confounders, such as patients' genetic predisposition to certain cancer types, diet, physical activity, and exposure to indoor pollutants, was not available, and this could have led to residual confounding.
Another limitation of this study was that we used exposure data gathered at a single point in time, analyzed with LUR models, to estimate long-term concentrations of air pollutants. Nevertheless, the temporal stability of these models for traffic-related air pollution has been shown in studies. Researchers have reported that LUR models were able to provide reliable estimates for a period of 7 years in Vancouver 66.
We had no information about the residential history of patients, which could have confounded our analyses. Additional studies are needed to determine the effect of relocation on air pollution exposure and the incidence and severity of BC.
Another limitation of this study was that we tested multiple hypotheses with a type I error of 0.05, and some of these comparisons might have reached significance by chance.

Conclusion
In summary, we found substantial evidence that higher levels of air pollutants in outdoor air, particularly NO, ethylbenzene, o-xylene, m-xylene, and benzene, were associated with increased odds of a higher BC stage at diagnosis. Furthermore, the association between air pollutants and BC severity appeared stronger in premenopausal women. Our work has implications for future environmental justice studies investigating the influence of SES on the association between air pollutants and BC. Additionally, more research on this association will improve our understanding of the mechanisms underlying the role of air pollutants in the severity of BC.

Figure 1. Spatial distribution of breast cancer patients in different areas of Tehran in 2008-2016 (n = 1148).

Figure 4. Distribution of (a) quantile sum regression model index weights and (b) positive and negative weights estimation in the simulated dataset of multipollutant percentile quartile for each air pollution.

Our study indicated that the dominant pollutants in the gWQS model were p-xylene, NO, o-xylene, and PM10.

Table 1. The demographic and clinical characteristics of women diagnosed with breast cancer in different areas of Tehran in 2008-2016. BC breast cancer, PR Progesterone receptor, ER Estrogen receptor. Numbers may not total to 100% due to missing data.

Figure 2. Breast cancer severity status among women diagnosed with breast cancer (n = 1148).

Table 2. Summary of air pollution and highway proximity variables. SD standard deviation, Min Minimum, Max Maximum, IQR Interquartile range.

Table 3. Crude odds ratios (OR) between independent variables and breast cancer stages. BC breast cancer, PR Progesterone receptor, ER Estrogen receptor. *P < 0.05. **The odds ratio is estimated for each 10-unit increase in pollutants. †The numbers in the subgroups were too low to calculate the OR.

Table 4. Adjusted odds ratios and 95% confidence intervals between each 10-unit increase in air pollutants and breast cancer stages. Reference group: non-aggressive stage of breast cancer. Adjusted for education level, smoking status, diabetes, family history of BC, age at first menstruation, number of pregnancies, and highway proximity. *P < 0.05, **P < 0.01.

Table 5. Adjusted odds ratios and 95% confidence intervals of the association between each 10-unit increase in air pollutants and stage of cancer in SES categories. Reference group: non-aggressive stage of breast cancer. Adjusted for education level, smoking status, diabetes, family history of BC, number of pregnancies, and highway proximity. BC breast cancer, SES socioeconomic status. *P ≤ 0.05.